Results 1 - 13 of 13
1.
BMC Genomics ; 25(1): 299, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38515031

ABSTRACT

BACKGROUND: Many studies have been performed to identify genomic loci and genes associated with meat quality in pigs. However, the full genetic architecture of the trait remains unclear, in part because related structural variations (SVs) have not been accurately identified, owing to the shortage of target breeds, the limitations of sequencing data, and the incompleteness of genome assemblies. The recent generation of a new pig breed with superior meat quality, called Nanchukmacdon, and its chromosome-level genome assembly (the NCMD assembly) has provided new opportunities. RESULTS: By applying assembly-based SV calling approaches to various genome assemblies of pigs including Nanchukmacdon, the impact of SVs on meat quality was investigated. In particular, by checking the commonality of SVs with other pig breeds, a total of 13,819 Nanchukmacdon-specific SVs (NSVs) were identified, which have a potential effect on the unique meat quality of Nanchukmacdon. The regulatory potentials of NSVs for the expression of nearby genes were further examined using transcriptome- and epigenome-based analyses in different tissues. CONCLUSIONS: Whole-genome comparisons based on chromosome-level genome assemblies have led to the discovery of SVs affecting meat quality in pigs, and their regulatory potentials were analyzed. The identified NSVs will provide new insights into the genetic architecture underlying meat quality in pigs. Finally, this study confirms the utility of chromosome-level genome assemblies and multi-omics analysis for enhancing the understanding of unique phenotypes.


Subjects
Genome , Genomics , Swine/genetics , Animals , Meat/analysis , Phenotype , Chromosomes
2.
Sci Data ; 10(1): 761, 2023 11 03.
Article in English | MEDLINE | ID: mdl-37923776

ABSTRACT

As plentiful high-quality genome assemblies have been accumulated, reference-guided genome assembly can be a good approach to reconstruct a high-quality assembly. Here, we present a chromosome-level genome assembly of the Korean crossbred pig called Nanchukmacdon (the NCMD assembly) using the reference-guided assembly approach with short and long reads. The NCMD assembly contains 20 chromosome-level scaffolds with a total size of 2.38 Gbp (N50: 138.77 Mbp). Its BUSCO score is 93.1%, which is comparable to the pig reference assembly, and a total of 20,588 protein-coding genes, 8,651 non-coding genes, and 996.14 Mbp of repetitive elements are annotated. The NCMD assembly was also used to close many gaps in the pig reference assembly. This NCMD assembly and annotation provide foundational resources for the genomic analyses of pig and related species.


Subjects
Chromosomes , Genome , Sus scrofa , Swine , Animals , Chromosomes/genetics , Genomics , Molecular Sequence Annotation , Republic of Korea , Sus scrofa/genetics , Swine/genetics
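
The abstract above reports an N50 of 138.77 Mbp for the NCMD assembly. As a reminder of what that statistic measures, here is a minimal sketch of the standard N50 definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The scaffold lengths used are hypothetical and are not taken from the NCMD assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half
        # of the total assembly size is the N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
scaffold_lengths = [100, 80, 60, 40, 20]
print(n50(scaffold_lengths))  # prints 80
```

With these values the total is 300, so the running sum first reaches half of it (150) at the second-longest scaffold, giving an N50 of 80.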
3.
Comput Struct Biotechnol J ; 21: 444-451, 2023.
Article in English | MEDLINE | ID: mdl-36618978

ABSTRACT

Constructing accurate microbial genome assemblies is necessary to understand genetic diversity in microbial genomes and its functional consequences. However, this remains a challenging task, especially when only short-read sequencing technologies are used. Here, we present a new read-clustering algorithm, called RBRC, for improving de novo microbial genome assembly by accurately estimating read proximity using multiple reference genomes. The performance of RBRC was confirmed by simulation-based evaluation in terms of assembly contiguity and the number of misassemblies, and RBRC was successfully applied to existing fungal and bacterial genomes, improving the quality of the assemblies without using additional sequencing data. RBRC is a very useful read-clustering algorithm that can be used (i) for generating high-quality genome assemblies of microbial strains when genome assemblies of related strains are available, and (ii) for upgrading existing microbial genome assemblies when the generation of additional sequencing data, such as long reads, is difficult.

4.
CNS Neurosci Ther ; 29(4): 1034-1048, 2023 04.
Article in English | MEDLINE | ID: mdl-36575854

ABSTRACT

BACKGROUND: Alzheimer's disease (AD), the most prevalent form of dementia, affects 6.5 million Americans and over 50 million people globally. Clinical, genetic, and phenotypic studies of dementia provide some insights into the observed progressive neurodegenerative processes; however, the mechanisms underlying AD onset remain enigmatic. AIMS: This paper examines late-onset dementia-related cognitive impairment utilizing neuroimaging-genetics biomarker associations. MATERIALS AND METHODS: The participants, ages 65-85, included 266 healthy controls (HC), 572 volunteers with mild cognitive impairment (MCI), and 188 Alzheimer's disease (AD) patients. Genotype dosage data for AD-associated single nucleotide polymorphisms (SNPs) were extracted from the imputed ADNI genetics archive using sample-major additive coding. Twenty-nine such SNPs were selected, representing a subset of independent SNPs reported to be highly associated with AD in a recent AD meta-GWAS study by Jansen and colleagues. RESULTS: We identified significant correlations between the 29 genomic markers (GMs) and the 200 neuroimaging markers (NIMs). The odds ratios and relative risks for AD and MCI (relative to HC) were predicted using multinomial linear models. DISCUSSION: In the HC and MCI cohorts, mainly cortical thickness measures were associated with GMs, whereas the AD cohort exhibited different GM-NIM relations. Network patterns within the HC and AD groups were distinct in cortical thickness, volume, and proportion of White to Gray Matter (pct), but not in the MCI cohort. Multinomial linear models of clinical diagnosis showed precisely the specific NIMs and GMs that were most impactful in discriminating between AD and HC, and between MCI and HC. CONCLUSION: This study suggests that advanced analytics provide mechanisms for exploring the interrelations between morphometric indicators and GMs. The findings may facilitate further clinical investigations of phenotypic associations that support deep systematic understanding of AD pathogenesis.


Subjects
Alzheimer Disease , Cognitive Dysfunction , Humans , Aged , Aged, 80 and over , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/genetics , Brain/diagnostic imaging , Brain/pathology , Neuroimaging/methods , Cognitive Dysfunction/diagnostic imaging , Cognitive Dysfunction/genetics , Cognitive Dysfunction/complications , Gray Matter/pathology , Disease Progression
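
The abstract above mentions odds ratios and relative risks for AD and MCI relative to HC. The sketch below illustrates how the two quantities differ, assuming a multinomial logistic (softmax) model with HC as the reference class. That model form, the single predictor (a count of risk alleles), and every coefficient value are assumptions made for illustration; they are not the study's model or results.

```python
import math


def class_probabilities(coefs, x):
    """Softmax class probabilities; the reference class HC has score 0."""
    scores = {"HC": 0.0}
    for label, (intercept, slope) in coefs.items():
        scores[label] = intercept + slope * x
    denom = sum(math.exp(s) for s in scores.values())
    return {label: math.exp(s) / denom for label, s in scores.items()}


# Hypothetical (intercept, slope per extra risk allele) for each class.
coefs = {"MCI": (0.2, 0.3), "AD": (-0.5, 0.7)}

p0 = class_probabilities(coefs, 0)  # no risk allele
p1 = class_probabilities(coefs, 1)  # one risk allele

# Odds ratio of AD versus HC for one extra allele: equals exp(slope_AD).
odds_ratio_ad = (p1["AD"] / p1["HC"]) / (p0["AD"] / p0["HC"])

# Relative risk of AD for one extra allele: ratio of AD probabilities.
relative_risk_ad = p1["AD"] / p0["AD"]

print(round(odds_ratio_ad, 3), round(relative_risk_ad, 3))
```

In this parameterization the odds ratio depends only on the AD slope, whereas the relative risk also depends on the baseline probabilities of all three classes, which is why the two numbers generally differ.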
5.
Nucleic Acids Res ; 50(W1): W254-W260, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35552439

ABSTRACT

Deep learning has been applied to many biological problems, and it has shown outstanding performance. Applying deep learning in research requires knowledge of deep learning theories and programming skills, but researchers have developed diverse deep learning platforms to allow users to build deep learning models without programming. Despite these efforts, it is still difficult for biologists to use deep learning because of limitations of the existing platforms. Therefore, a new platform is necessary that can solve these challenges for biologists. To alleviate this situation, we developed a user-friendly and easy-to-use web application called DLEB (Deep Learning Editor for Biologists) that allows for building deep learning models specialized for biologists. DLEB helps researchers (i) design deep learning models easily and (ii) generate corresponding Python code to run directly on their machines. DLEB provides other useful features for biologists, such as recommending deep learning models for specific learning tasks and data, pre-processing of input biological data, and availability of various template models and example biological datasets for model training. DLEB can serve as a highly valuable platform for easily applying deep learning to solve many important biological problems. DLEB is freely available at http://dleb.konkuk.ac.kr/.


Subjects
Deep Learning , Software
6.
Gigascience ; 11, 2022 05 17.
Article in English | MEDLINE | ID: mdl-35579554

ABSTRACT

BACKGROUND: Metagenomic assembly using high-throughput sequencing data is a powerful method to construct microbial genomes in environmental samples without cultivation. However, metagenomic assembly, especially when only short reads are available, is a complex and challenging task because mixed genomes of multiple microorganisms constitute the metagenome. Although long-read sequencing technologies have been developed and have begun to be used for metagenomic assembly, many metagenomic studies have been performed based on short reads because generating long reads costs more than generating short reads. RESULTS: In this study, we present a new method called PLR-GEN. It creates pseudo-long reads from metagenomic short reads based on given reference genome sequences by considering small sequence variations existing in individual genomes of the same or different species. When applied to a mock community data set in the Human Microbiome Project, PLR-GEN dramatically extended 101-bp short reads to pseudo-long reads with an N50 of 33 Kbp and a 0.4% error rate. The use of these pseudo-long reads generated by PLR-GEN resulted in an obvious improvement of metagenomic assembly in terms of the number of sequences, assembly contiguity, and prediction of species and genes. CONCLUSIONS: PLR-GEN can be used to generate artificial long read sequences without extra sequencing cost, thus aiding various studies using metagenomes.


Subjects
Metagenome , Microbiota , High-Throughput Nucleotide Sequencing/methods , Humans , Metagenomics/methods , Microbiota/genetics , Sequence Analysis, DNA/methods
7.
BMC Bioinformatics ; 21(1): 185, 2020 May 12.
Article in English | MEDLINE | ID: mdl-32397982

ABSTRACT

BACKGROUND: Microorganisms are important occupants of many different environments. Identifying the composition of microbes and estimating their abundance promote understanding of interactions of microbes in environmental samples. To understand their environments more deeply, the composition of microorganisms in environmental samples has been studied using metagenomes, which are the collections of genomes of the microorganisms. Although many tools have been developed for taxonomy analysis based on different algorithms, variability of analysis outputs of existing tools from the same input metagenome datasets is the main obstacle for many researchers in this field. RESULTS: Here, we present a novel meta-analysis tool for metagenome taxonomy analysis, called TAMA, by intelligently integrating outputs from three different taxonomy analysis tools. Using an integrated reference database, TAMA performs taxonomy assignment for input metagenome reads based on a meta-score by integrating scores of taxonomy assignment from different taxonomy classification tools. TAMA outperformed existing tools when evaluated using various benchmark datasets. It was also successfully applied to obtain relative species abundance profiles and difference in composition of microorganisms in two types of cheese metagenome and human gut metagenome. CONCLUSION: TAMA can be easily installed and used for metagenome read classification and the prediction of relative species abundance from multiple numbers and types of metagenome read samples. TAMA can be used to more accurately uncover the composition of microorganisms in metagenome samples collected from various environments, especially when the use of a single taxonomy analysis tool is unreliable. TAMA is an open source tool, and can be downloaded at https://github.com/jkimlab/TAMA.


Subjects
Bacteria/classification , Classification/methods , Metagenome , Metagenomics/methods , Bacteria/genetics , Databases, Genetic , Datasets as Topic , High-Throughput Nucleotide Sequencing , Models, Genetic , Phylogeny
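
The abstract above describes assigning a taxon to each read from a meta-score that integrates the scores of several classification tools. The sketch below shows one generic way such an integration could work, namely a weighted vote with a cutoff. It is not TAMA's actual formula: the function `meta_assign`, the tool names, the weights, and the threshold are all hypothetical.

```python
from collections import defaultdict


def meta_assign(tool_calls, weights, threshold=0.5):
    """Combine per-tool (taxon, score) calls for one read.

    tool_calls: {tool_name: (taxon, score in [0, 1])}
    weights:    {tool_name: non-negative weight}
    Returns (taxon or None, meta_score).
    """
    total_weight = sum(weights[tool] for tool in tool_calls)
    if total_weight == 0:
        return None, 0.0
    meta = defaultdict(float)
    for tool, (taxon, score) in tool_calls.items():
        # A tool contributes its weighted score only to the taxon it chose.
        meta[taxon] += weights[tool] * score
    best = max(meta, key=meta.get)
    meta_score = meta[best] / total_weight
    if meta_score >= threshold:
        return best, meta_score
    return None, meta_score  # left unclassified


# Hypothetical calls from three tools for a single read.
calls = {
    "tool_a": ("Escherichia coli", 0.9),
    "tool_b": ("Escherichia coli", 0.8),
    "tool_c": ("Salmonella enterica", 0.6),
}
weights = {"tool_a": 1.0, "tool_b": 1.0, "tool_c": 1.0}
taxon, score = meta_assign(calls, weights)
print(taxon, round(score, 3))  # prints: Escherichia coli 0.567
```

Dividing by the total weight of all tools that made a call means disagreement lowers the meta-score, so a read on which the tools split evenly can fall below the threshold and stay unclassified.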
8.
Curr Protoc Bioinformatics ; 68(1): e88, 2019 12.
Article in English | MEDLINE | ID: mdl-31751498

ABSTRACT

INTER-Species Protein Interaction Analysis (INTERSPIA) is a web application for identifying diverse patterns of protein-protein interactions (PPIs) in different species. Given a set of proteins of interest to the user, INTERSPIA first discovers additional proteins that are functionally associated with the input proteins as well as different or common patterns of PPIs among the proteins in multiple species through a server-side pipeline. Second, it visualizes the dynamics of PPIs in multiple species via an easy-to-use web interface. This article contains a basic protocol describing how to visualize diverse patterns of PPIs of input proteins in multiple species, and how to use them for functional analysis in the web interface. INTERSPIA is freely available at http://bioinfo.konkuk.ac.kr/INTERSPIA/. © 2019 by John Wiley & Sons, Inc. Basic Protocol: Running INTERSPIA using a list of input proteins.


Subjects
Protein Interaction Mapping/methods , Software , Animals , Databases, Protein , Humans , Internet , Species Specificity , User-Computer Interface
9.
PLoS One ; 14(8): e0221858, 2019.
Article in English | MEDLINE | ID: mdl-31454399

ABSTRACT

BACKGROUND: Genomic data have become major resources to understand complex mechanisms at fine-scale temporal and spatial resolution in functional and evolutionary genetic studies, including human diseases, such as cancers. Recently, a large number of whole genomes of evolving populations of yeast (Saccharomyces cerevisiae W303 strain) were sequenced in a time-dependent manner to identify temporal evolutionary patterns. For this type of study, a chromosome-level sequence assembly of the strain or population at time zero is required to compare with the genomes derived later. However, there is no fully automated computational approach in experimental evolution studies to establish the chromosome-level genome assembly using unique features of sequencing data. METHODS AND RESULTS: In this study, we developed a new software pipeline, the integrative meta-assembly pipeline (IMAP), to build chromosome-level genome sequence assemblies by generating and combining multiple initial assemblies using three de novo assemblers from short-read sequencing data. We significantly improved the continuity and accuracy of the genome assembly using a large collection of sequencing data and hybrid assembly approaches. We validated our pipeline by generating chromosome-level assemblies of yeast strains W303 and SK1, and compared our results with assemblies built using long-read sequencing and various assembly evaluation metrics. We also constructed chromosome-level sequence assemblies of S. cerevisiae strain Sigma1278b, and three commonly used fungal strains: Aspergillus nidulans A713, Neurospora crassa 73, and Thielavia terrestris CBS 492.74, for which long-read sequencing data are not yet available. Finally, we examined the effect of IMAP parameters, such as reference and resolution, on the quality of the final assembly of the yeast strains W303 and SK1. 
CONCLUSIONS: We developed a cost-effective pipeline to generate chromosome-level sequence assemblies using only short-read sequencing data. Our pipeline combines the strengths of reference-guided and meta-assembly approaches. Our pipeline is available online at http://github.com/jkimlab/IMAP including a Docker image, as well as a Perl script, to help users install the IMAP package, including several prerequisite programs. Users can use IMAP to easily build the chromosome-level assembly for the genome of their interest.


Subjects
Sequence Analysis, DNA , Software , Chromosomes, Fungal , Genome, Fungal , Molecular Sequence Annotation , Synteny/genetics
10.
BMC Bioinformatics ; 20(1): 147, 2019 Mar 18.
Article in English | MEDLINE | ID: mdl-30885117

ABSTRACT

BACKGROUND: Thanks to recent advancements in next-generation sequencing (NGS) technologies, large amounts of genomic data, in the form of short DNA sequences known as reads, have been accumulating. Diverse assemblers have been developed to generate high-quality de novo assemblies using the NGS reads, but their output is very different because of algorithmic differences. However, there are no properly structured measures to show the similarity or difference between assemblies. RESULTS: We developed a new measure, called the GMASS score, for comparing two genome assemblies in terms of their structure. The GMASS score was developed based on the distribution pattern of the number and coverage of similar regions between a pair of assemblies. The new measure was able to show structural similarity between assemblies when evaluated on simulated assembly datasets. The application of the GMASS score to compare assemblies in recently published benchmark datasets showed the divergent performance of current assemblers as well as its ability to compare assemblies. CONCLUSION: The GMASS score is a novel measure for representing structural similarity between two assemblies. It will contribute to the understanding of assembly output and the development of de novo assemblers.


Subjects
Genome, Human , Genomics , Databases, Genetic , High-Throughput Nucleotide Sequencing , Humans , Models, Theoretical , Molecular Structure , Sequence Analysis, DNA
11.
BMC Bioinformatics ; 19(1): 216, 2018 06 05.
Article in English | MEDLINE | ID: mdl-29871588

ABSTRACT

BACKGROUND: Advances in sequencing technologies have facilitated large-scale comparative genomics based on whole genome sequencing. Constructing and investigating conserved genomic regions among multiple species (called synteny blocks) are essential in comparative genomics. However, they require significant amounts of computational resources and time in addition to bioinformatics skills. Many web interfaces have been developed to make such tasks easier. However, these web interfaces cannot be customized for users who want to use their own set of genome sequences or definition of synteny blocks. RESULTS: To resolve this limitation, we present mySyntenyPortal, a stand-alone application package to construct websites for synteny block analyses by using users' own genome data. mySyntenyPortal provides both command line and web-based interfaces to build and manage websites for large-scale comparative genomic analyses. The websites can also be easily published and accessed by other users. To demonstrate the usability of mySyntenyPortal, we present an example study for building websites to compare genomes of three mammalian species (human, mouse, and cow) and show how they can be easily utilized to identify potential genes affected by genome rearrangements. CONCLUSIONS: mySyntenyPortal will contribute to extended comparative genomic analyses based on large-scale whole genome sequences by providing unique functionality to support the easy creation of interactive websites for synteny block analyses from users' own genome data.


Subjects
Genomics/methods , Software , Synteny , Animals , Cattle , Female , Genome , Humans , Internet , Mice , Whole Genome Sequencing
12.
Nucleic Acids Res ; 46(W1): W89-W94, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29746660

ABSTRACT

Proteins perform biological functions through cascading interactions with each other by forming protein complexes. As a result, interactions among proteins, called protein-protein interactions (PPIs), are not completely free from selection constraint during evolution. Therefore, the identification and analysis of PPI changes during evolution can give us new insight into the evolution of functions. Although many algorithms, databases and websites have been developed to help the study of PPIs, most of them are limited to visualizing the structure and features of PPIs in a single chosen species, with limited visualization functions. This leads to difficulties in the identification of different patterns of PPIs in different species and their functional consequences. To resolve these issues, we developed a web application, called INTER-Species Protein Interaction Analysis (INTERSPIA). Given a set of proteins of interest to the user, INTERSPIA first discovers additional proteins that are functionally associated with the input proteins and searches for different patterns of PPIs in multiple species through a server-side pipeline, and second visualizes the dynamics of PPIs in multiple species using an easy-to-use web interface. INTERSPIA is freely available at http://bioinfo.konkuk.ac.kr/INTERSPIA/.


Subjects
Computational Biology , Internet , Protein Interaction Mapping/methods , Software , Algorithms , Databases, Protein , Proteins/chemistry , Proteins/genetics , User-Computer Interface
13.
Sci Rep ; 7(1): 17303, 2017 12 11.
Article in English | MEDLINE | ID: mdl-29230066

ABSTRACT

Rapid and cost-effective production of large-scale genome data through next-generation sequencing has enabled population-level studies of various organisms to identify their genotypic differences and phenotypic consequences. This approach is also used to study indigenous animals of historical and economic value, although they are less studied than model organisms. The objective of this study was to perform functional and evolutionary analysis of the Korean bob-tailed native dog Donggyeong, which has a distinct tail and agility phenotype, using whole-genome sequencing data with population and comparative genomics approaches. Based on the uniqueness of non-synonymous single nucleotide polymorphisms obtained from next-generation sequencing data, Donggyeong dog-specific genes/proteins and their functions were identified by comparison with 12 other dog breeds and six other related species. These proteins were further divided into subpopulation-specific ones with different tail length, and protein interaction-level signatures were investigated. Finally, the trajectory of shaping protein interactions of subpopulation-specific proteins during evolution was uncovered. This study expands our knowledge of Korean native dogs. Our results also provide a good example of using whole-genome sequencing data for population-level analysis in closely related species.


Subjects
Biomarkers/metabolism , Evolution, Molecular , Genome , High-Throughput Nucleotide Sequencing/methods , Polymorphism, Single Nucleotide , Tail/physiology , Whole Genome Sequencing/methods , Animals , Dogs , Genotype , Phenotype , Phylogeny , Protein Interaction Maps , Tail/anatomy & histology